Abstract
Introduction: Acute Myeloid Leukemia (AML) is a biologically diverse disease. Expanded mutation panels and novel epigenetic assays are identifying an increasing number of putative AML subtypes beyond the traditional 'Good', 'Intermediate', and 'Poor' risk designations. Although these approaches show great promise, identifying the relevant underlying disease biology remains difficult. Single cell studies highlight this difficulty, showing dynamic interactions between multiple subclones, each with its own set of cooperating mutations interfering with normal hematopoiesis. We have previously shown that bulk ATAC data can be used to 'deconvolve' and identify hematopoietic state in AML samples. Here we extend this work, showing that this approach can be used to identify normal hematopoietic states with a high degree of accuracy. In addition, we show that AML samples that appear different in bulk actually contain overlapping lineage characteristics at the single cell level.
Methods: Single cell ATAC-seq count files were downloaded from GSE74310 and GSE96769, along with the corresponding bulk ATAC-seq count files from GSE74912 and GSE96771. These data were generated by flow sorting normal specimens into well-known stages of hematopoiesis, followed by either bulk or single cell ATAC-seq. A set of AML samples was also processed by both single cell and bulk ATAC-seq. Bulk ATAC data was normalized using DESeq2 followed by variance stabilizing transformation. Single cell data was processed and normalized using the Seurat pipeline with default parameters. A common peak atlas was created for each dataset, and peaks characteristic of each stage of hematopoiesis were selected using a modified Kruskal-Wallis statistic and optimized using a set of well-characterized in vitro sample mixtures. Lineage deconvolution was performed using non-negative least squares regression, comparing each unknown sample to the set of normal hematopoietic states.
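The deconvolution step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the reference matrix, the mixture, and the helper name `deconvolve_lineage` are hypothetical, the data are synthetic, and rescaling the coefficients to sum to 1 is an assumption made here for readability. The only library call relied on is `scipy.optimize.nnls`, which solves a non-negative least squares problem.

```python
import numpy as np
from scipy.optimize import nnls


def deconvolve_lineage(reference, sample):
    """Estimate non-negative lineage weights for one sample.

    reference: (n_peaks, n_states) accessibility profile of each normal state
    sample:    (n_peaks,) accessibility profile of the unknown sample
    Returns the weights rescaled to sum to 1 (an assumption of this sketch)
    and the residual norm of the fit.
    """
    coefficients, residual_norm = nnls(reference, sample)
    total = coefficients.sum()
    fractions = coefficients / total if total > 0 else coefficients
    return fractions, residual_norm


# Synthetic illustration only: random reference profiles, not real ATAC data.
rng = np.random.default_rng(0)
states = ["HSC", "MPP", "LMPP", "CLP", "GMP", "MEP", "Mono"]
reference = rng.gamma(shape=2.0, scale=1.0, size=(500, len(states)))

# Hypothetical sample: 60% GMP-like and 40% monocyte-like signal plus noise.
true_mix = np.array([0.0, 0.0, 0.0, 0.0, 0.6, 0.0, 0.4])
sample = reference @ true_mix + rng.normal(0.0, 0.01, size=500)

fractions, residual_norm = deconvolve_lineage(reference, sample)
for state, fraction in zip(states, fractions):
    print(f"{state}: {fraction:.3f}")
```

In this sketch each column of the reference matrix stands for one sorted normal population restricted to the selected lineage-specific peaks; the same call could in principle be applied to a bulk profile or to a single cell profile.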
Results: Dimensionality reduction of single cell ATAC-seq using uniform manifold approximation and projection (UMAP) largely recapitulates the stages of hematopoiesis used to sort the samples (Figure 1a). Single cell lineage deconvolution is able to identify the purity of these populations more precisely (Figure 1b), with HSC, MPP, LMPP, CLP, GMP, MEP, and Monocytic stages showing relatively pure lineage characteristics. In contrast, the CMP stage appears to be composed of a heterogeneous population, as has been previously shown. Dimensionality reduction of bulk ATAC-seq data using Principal Component Analysis (PCA) illustrates distinct stages of hematopoiesis and separates the AML samples into two groups (Figure 2c). To further analyze these groups, bulk lineage deconvolution was performed, showing that cluster 1 (purple) has a more differentiated appearance characterized by GMP and Monocyte lineages, while cluster 2 also reflects earlier stages of hematopoiesis including HSC, MPP, and LMPP (Figure 2d). One sample from each cluster (highlighted in red in Figure 2c,d) was evaluated using single cell ATAC-seq. Lineage deconvolution on the component cells illustrates substantial overlap in lineage characteristics between subclones of these samples, with lineage-based hierarchical clustering generating two clusters of mixed sample origin (Figure 2e). These clusters are separated into more and less differentiated lineage groups, with the cluster 2 sample cells more commonly having an HSC- or MPP-dominant lineage. However, some cluster 1 cells have HSC or MPP lineage features as well, which is reflected by the poor association of cluster with sample (Fisher's exact p=0.8).
Conclusions: Lineage deconvolution can be performed on single cell ATAC-seq data with a high degree of precision on normal samples and illustrates clonal lineage heterogeneity in malignant specimens not previously appreciated in bulk sequencing analysis. Analysis of greater numbers of samples and cells is needed to draw general conclusions, but the approach shows promise as a means of computationally identifying or sorting normal single cells and more precisely characterizing leukemias.
Melnick: Janssen Pharmaceuticals: Research Funding; Sanofi: Research Funding; Daiichi Sankyo: Research Funding; Epizyme: Consultancy; Constellation: Consultancy; KDAC Pharma: Membership on an entity's Board of Directors or advisory committees. Elemento: Johnson and Johnson: Research Funding; Volastra Therapeutics: Consultancy, Other: Current equity holder, Research Funding; Eli Lilly: Research Funding; Janssen: Research Funding; One Three Biotech: Consultancy, Other: Current equity holder; Champions Oncology: Consultancy; Freenome: Consultancy, Other: Current equity holder in a privately-held company; Owkin: Consultancy, Other: Current equity holder; AstraZeneca: Research Funding. Levine: Lilly: Honoraria; Gilead: Honoraria; Janssen: Consultancy; Morphosys: Consultancy; Astellas: Consultancy; Roche: Honoraria, Research Funding; Incyte: Consultancy; Amgen: Honoraria; Celgene: Research Funding; Isoplexis: Membership on an entity's Board of Directors or advisory committees; C4 Therapeutics: Membership on an entity's Board of Directors or advisory committees; Prelude: Membership on an entity's Board of Directors or advisory committees; Auron: Membership on an entity's Board of Directors or advisory committees; Ajax: Membership on an entity's Board of Directors or advisory committees; Zentalis: Membership on an entity's Board of Directors or advisory committees; Mission Bio: Membership on an entity's Board of Directors or advisory committees; Imago: Membership on an entity's Board of Directors or advisory committees; QIAGEN: Membership on an entity's Board of Directors or advisory committees. Glass: GLG: Consultancy.